 normal equation


Supplementary Material A Experimentation Details

Neural Information Processing Systems

A.1 Source code Upon request, we will provide an anonymized version of our code in the rebuttal. We replicated our experiments using the codebase provided by Shah et al. [2022], which can be found at github. To ensure consistency, we used the same hyperparameters as mentioned in the code or article for the baselines. This helps ensure the stability of metric learning. We initialize the parameters so that the predicted metric is close to the Euclidean metric.


Selective Forgetting in Option Calibration: An Operator-Theoretic Gauss-Newton Framework

arXiv.org Artificial Intelligence

Modern financial models are not static; they are recalibrated as market conditions change. Calibrating parametric asset-pricing models to market data has therefore long been of interest to both practitioners and academics in mathematical finance. Risk management systems and trading desks rely heavily on repeatedly solving inverse problems that calibrate and adjust the parameters θ so that the model-based prices m(x;θ) reproduce observed quotes to within some accuracy. Option-implied volatility surfaces evolve minute by minute, and model parameters such as mean reversion, volatility of volatility, or correlation are adapted to new market information.



Overcoming the Loss Conditioning Bottleneck in Optimization-Based PDE Solvers: A Novel Well-Conditioned Loss Function

arXiv.org Machine Learning

Optimization-based PDE solvers that minimize scalar loss functions have gained increasing attention in recent years. These methods either define the loss directly over discrete variables, as in Optimizing a Discrete Loss (ODIL), or indirectly through a neural network surrogate, as in Physics-Informed Neural Networks (PINNs). However, despite their promise, such methods often converge much more slowly than classical iterative solvers and are commonly regarded as inefficient. This work provides a theoretical insight, attributing the inefficiency to the use of the mean squared error (MSE) loss, which implicitly forms the normal equations, squares the condition number, and severely impairs optimization. To address this, we propose a novel Stabilized Gradient Residual (SGR) loss. By tuning a weight parameter, it flexibly modulates the condition number between the original system and its normal equations, while reducing to the MSE loss in the limiting case. We systematically benchmark the convergence behavior and optimization stability of the SGR loss within both the ODIL framework and PINNs (employing either numerical or automatic differentiation) and compare its performance against classical iterative solvers. Numerical experiments on a range of benchmark problems demonstrate that, within the ODIL framework, the proposed SGR loss achieves orders-of-magnitude faster convergence than the MSE loss. Further validation within the PINNs framework shows that, despite the high nonlinearity of neural networks, SGR consistently outperforms the MSE loss. These theoretical and empirical findings help bridge the performance gap between classical iterative solvers and optimization-based solvers, highlighting the central role of loss conditioning, and provide key insights for the design of more efficient PDE solvers.
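The claim that the MSE loss squares the condition number can be checked numerically. Minimizing ||Ax - b||^2 leads to the normal equations A^T A x = A^T b, and in the 2-norm cond(A^T A) = cond(A)^2. The following is a minimal sketch, assuming a 1D Poisson-type finite-difference matrix as a stand-in system; the matrix and its size are illustrative choices, not taken from the paper, and the SGR loss itself is not implemented here.

```python
import numpy as np

# Hypothetical stand-in for a discretized PDE operator: the tridiagonal
# 1D Poisson-type matrix with 2 on the diagonal and -1 off the diagonal.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

# 2-norm condition number of the original system matrix.
cond_A = np.linalg.cond(A)

# Condition number of the normal-equations matrix A^T A, which is what
# a gradient-based optimizer effectively faces when minimizing ||Ax - b||^2.
cond_normal = np.linalg.cond(A.T @ A)

print(f"cond(A)     = {cond_A:.3e}")
print(f"cond(A^T A) = {cond_normal:.3e}")
print(f"cond(A)^2   = {cond_A**2:.3e}")
```

The last two printed values should agree up to rounding, which illustrates why a solver working on the MSE loss can be far slower than one working on the original system.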


Event-based Photometric Bundle Adjustment

arXiv.org Artificial Intelligence

We tackle the problem of bundle adjustment (i.e., simultaneous refinement of camera poses and scene map) for a purely rotating event camera. Starting from first principles, we formulate the problem as a classical non-linear least squares optimization. The photometric error is defined using the event generation model directly in the camera rotations and the semi-dense scene brightness that triggers the events. We leverage the sparsity of event data to design a tractable Levenberg-Marquardt solver that handles the very large number of variables involved. To the best of our knowledge, our method, which we call Event-based Photometric Bundle Adjustment (EPBA), is the first event-only photometric bundle adjustment method that works on the brightness map directly and exploits the spacetime characteristics of event data, without having to convert events into image-like representations. Comprehensive experiments on both synthetic and real-world datasets demonstrate EPBA's effectiveness in decreasing the photometric error (by up to 90%), yielding results of unparalleled quality. The refined maps reveal details that were hidden using prior state-of-the-art rotation-only estimation methods. The experiments on modern high-resolution event cameras show the applicability of EPBA to panoramic imaging in various scenarios (without map initialization, at multiple resolutions, and in combination with other methods, such as IMU dead reckoning or previous event-based rotation estimation methods). We make the source code publicly available.


Event-based Mosaicing Bundle Adjustment

arXiv.org Artificial Intelligence

We tackle the problem of mosaicing bundle adjustment (i.e., simultaneous refinement of camera orientations and scene map) for a purely rotating event camera. We formulate the problem as a regularized non-linear least squares optimization. The objective function is defined using the linearized event generation model in the camera orientations and the panoramic gradient map of the scene. We show that this BA optimization has an exploitable block-diagonal sparsity structure, so that the problem can be solved efficiently. To the best of our knowledge, this is the first work to leverage such sparsity to speed up the optimization in the context of event-based cameras, without the need to convert events into image-like representations. We evaluate our method, called EMBA, on both synthetic and real-world datasets to show its effectiveness (50% photometric error decrease), yielding results of unprecedented quality. In addition, we demonstrate EMBA using high spatial resolution event cameras, yielding delicate panoramas in the wild, even without an initial map.


Moment Estimation for Nonparametric Mixture Models Through Implicit Tensor Decomposition

arXiv.org Machine Learning

We present an alternating least squares type numerical optimization scheme to estimate conditionally-independent mixture models in $\mathbb{R}^n$, without parameterizing the distributions. Following the method of moments, we tackle an incomplete tensor decomposition problem to learn the mixing weights and componentwise means. Then we compute the cumulative distribution functions, higher moments and other statistics of the component distributions through linear solves. Crucially for computations in high dimensions, the steep costs associated with high-order tensors are evaded, via the development of efficient tensor-free operations. Numerical experiments demonstrate the competitive performance of the algorithm, and its applicability to many models and applications. Furthermore we provide theoretical analyses, establishing identifiability from low-order moments of the mixture and guaranteeing local linear convergence of the ALS algorithm.


Working with QR Decomposition Part 3 (Machine Learning)

#artificialintelligence

Abstract: The CP tensor decomposition is used in applications such as machine learning and signal processing to discover latent low-rank structure in multidimensional data. Computing a CP decomposition via an alternating least squares (ALS) method reduces the problem to several linear least squares problems. The standard way to solve these linear least squares subproblems is to use the normal equations, which inherit special tensor structure that can be exploited for computational efficiency. However, the normal equations are sensitive to numerical ill-conditioning, which can compromise the results of the decomposition. In this paper, we develop versions of the CP-ALS algorithm using the QR decomposition and the singular value decomposition (SVD), which are more numerically stable than the normal equations, to solve the linear least squares problems.
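To make the stability argument concrete, the sketch below solves one generic linear least squares problem three ways: via the normal equations, via a QR decomposition, and via NumPy's SVD-based `lstsq`. It is not the CP-ALS algorithm from the paper; the Vandermonde design matrix, its size, and the all-ones solution are assumptions chosen only to produce a moderately ill-conditioned problem.

```python
import numpy as np

# Hypothetical ill-conditioned design matrix: monomials 1, t, ..., t^7
# sampled on [0, 1]. Any tall, ill-conditioned matrix would do.
t = np.linspace(0.0, 1.0, 50)
A = np.vander(t, 8, increasing=True)
x_true = np.ones(8)
b = A @ x_true  # consistent right-hand side, so x_true is the exact solution

# 1) Normal equations: solve (A^T A) x = A^T b.
#    The conditioning of A^T A is the square of that of A.
x_ne = np.linalg.solve(A.T @ A, A.T @ b)

# 2) QR: A = Q R with orthonormal Q, then solve R x = Q^T b.
#    np.linalg.solve does not exploit the triangular structure of R;
#    a dedicated triangular solver would be the usual choice.
Q, R = np.linalg.qr(A)  # reduced QR: Q is 50x8, R is 8x8
x_qr = np.linalg.solve(R, Q.T @ b)

# 3) SVD-based least squares as provided by NumPy.
x_svd = np.linalg.lstsq(A, b, rcond=None)[0]

err_ne = np.linalg.norm(x_ne - x_true)
err_qr = np.linalg.norm(x_qr - x_true)
err_svd = np.linalg.norm(x_svd - x_true)
print(f"normal equations error: {err_ne:.2e}")
print(f"QR error:               {err_qr:.2e}")
print(f"SVD (lstsq) error:      {err_svd:.2e}")
```

On problems like this one, the QR and SVD errors are typically several orders of magnitude smaller than the normal-equations error, at the cost of not reusing the structured Gram matrix.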


Gradient Descent

#artificialintelligence

Understanding the concept of the gradient is useful for understanding the logic of the gradient descent algorithm. Let's take a look at the explanation of the concept of a stationary point in Wikipedia. As can be understood from it, the gradient descent algorithm starts from a point on the cost function and, in each iteration, moves in the direction opposite to the derivative (slope) at that point, so that the cost decreases and the slope shrinks toward zero. The aim is to find the point whose slope is zero, which for a convex cost function is the minimum point. When the coordinate values of this point are substituted into the hypothesis function, the result is the hypothesis function of the model with the least error we can create.
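The idea of iterating until the slope is zero can be sketched in a few lines. The example below is a minimal illustration, assuming a one-dimensional convex cost J(x) = (x - 3)^2 + 1; the function name `gradient_descent`, the learning rate, and the tolerance are illustrative choices rather than anything prescribed by the article.

```python
def gradient_descent(grad, x0, learning_rate=0.1, tol=1e-8, max_iter=10_000):
    """Step against the gradient until the slope is (almost) zero."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:  # slope close to zero: stationary point reached
            break
        x = x - learning_rate * g  # move in the direction that lowers the cost
    return x


# Hypothetical cost J(x) = (x - 3)^2 + 1, whose derivative is 2 * (x - 3).
# Its only stationary point, and its minimum, is at x = 3.
x_min = gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0)
print(x_min)
```

With this cost, each step shrinks the distance to the minimum by a constant factor, so the loop stops well before `max_iter`.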


Normal Equation

#artificialintelligence

In machine learning, various optimization techniques can be used to reduce the error and thus increase accuracy. In this article, we will discuss the optimization of machine learning models with the normal equation method; to understand it, we should first take a look at the concept of the cost function. The cost function, although it has different variations (see MAE, RMSE, MSE), basically takes two variables (y_real, y_predicted). It allows us to measure the error, in other words, the difference between the actual output values and the predicted output values in machine learning models. As can be seen in the figure, the sum of the squares of the differences between the y values estimated by the hypothesis function and the actual y values gives us the squared-error cost function. If you want to learn more about this concept, you can reach my article on this subject from the link below.
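For a linear hypothesis, the normal equation gives the parameters that minimize the squared-error cost in closed form: theta solves (X^T X) theta = X^T y. The sketch below is a minimal illustration on made-up data generated from y = 1 + 2x; the data and the names `X`, `theta`, and `cost` are assumptions for the example, with `y_real` and `y_predicted` taken from the text above.

```python
import numpy as np

# Hypothetical training data generated exactly from y = 1 + 2x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_real = 1.0 + 2.0 * x

# Design matrix with a leading column of ones for the intercept term.
X = np.column_stack([np.ones_like(x), x])

# Normal equation: solve (X^T X) theta = X^T y instead of inverting X^T X.
theta = np.linalg.solve(X.T @ X, X.T @ y_real)

# Squared-error cost of the fitted hypothesis.
y_predicted = X @ theta
cost = np.sum((y_predicted - y_real) ** 2)

print("theta:", theta)
print("cost:", cost)
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse, which is generally the more accurate and cheaper option.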